Studying Luxembourgish Phonetics via Multilingual Forced Alignments
Abstract
Luxembourgish, a Germanic-Franconian language, is embedded in a multilingual context on the divide between Romance and Germanic cultures and remains one of Europe’s under-described languages. This paper investigates the similarity of Luxembourgish phone segments to those of German, French and English via forced speech alignment techniques. Making use of monolingual acoustic seed models from these three languages, as well as “multilingual” models trained on pooled speech data, we investigated whether Luxembourgish was globally better represented by one of the individual languages or by the multilingual model. Although French words are often interspersed in spoken Luxembourgish, forced alignments show a clear preference for Germanic acoustic models, with only limited use of the French ones. While the German models provide the best match globally, a phone-based analysis shows language-specific preferences: French is preferred for rounded front vowels, nasal vowels and //, whereas English is more frequently used for diphthongs. The proposed method enables the acoustic match between phonemes in different languages to be quantified and opens new perspectives in language processing studies for low e-resourced languages and for L2 learning.
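The phone-based analysis described in the abstract can be pictured as a simple tally: for each Luxembourgish phone, count how often the aligner selected the acoustic model of each source language. The sketch below is only an illustration under stated assumptions. The input format (a list of `(phone, language)` pairs), the function names `language_preference` and `preferred_language`, the phone labels and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def language_preference(aligned_segments):
    """Return, per phone, the share of segments aligned with each language's model.

    aligned_segments: iterable of (phone, language) pairs, one per aligned
    segment (assumed format, not the paper's actual alignment output).
    """
    counts = defaultdict(Counter)
    for phone, lang in aligned_segments:
        counts[phone][lang] += 1

    shares = {}
    for phone, lang_counts in counts.items():
        total = sum(lang_counts.values())
        shares[phone] = {lang: n / total for lang, n in lang_counts.items()}
    return shares


def preferred_language(shares):
    """Pick, per phone, the language whose model was selected most often."""
    return {phone: max(dist, key=dist.get) for phone, dist in shares.items()}


# Toy data invented for illustration only; these are not results from the paper.
segments = [
    ("y", "fr"), ("y", "fr"), ("y", "de"),
    ("ai", "en"), ("ai", "en"), ("ai", "de"),
    ("t", "de"),
]
shares = language_preference(segments)
best = preferred_language(shares)
```

Working with per-phone shares rather than raw counts keeps frequent and rare phones comparable, which matters when a global preference for one language could otherwise mask phone-specific differences.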
Similar resources
Comparing mono- & multilingual acoustic seed models for a low e-resourced language: a case-study of Luxembourgish
Luxembourgish is embedded in a multilingual context on the divide between Romance and Germanic cultures and has often been viewed as one of Europe’s under-resourced languages. We focus on the acoustic modeling of Luxembourgish. By taking advantage of monolingual acoustic seeds selected from German, French or English model sets via IPA symbol correspondences, we investigated whether Luxembourgis...
Speech alignment and recognition experiments for Luxembourgish
Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. In this paper, we propose to study acoustic similarities between Luxembourgish and major contact languages (German, French, English) with the help of automatic speech alignment and recognition systems. Experiments were run using monolingual ac...
Initializing acoustic phone models of under-resourced languages: a case-study of Luxembourgish
The national language of the Grand-Duchy of Luxembourg, Luxembourgish, has often been characterized as one of Europe’s under-described and under-resourced languages. In this contribution we report on our ongoing work to take Luxembourgish on board as an e-language: an electronically searchable spoken language. More specifically, we focus on the issue of producing acoustic seed models for Luxem...
Automatic language identity tagging on word and sentence-level in multilingual text sources: a case-study on Luxembourgish
Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. This is due to the fact that the written production remains relatively low, and linguistic knowledge and resources, such as lexica and pronunciation dictionaries, are sparse. The speakers or writers will frequently switch between Luxembourgish...
A first LVCSR system for Luxembourgish, an under-resourced European language
Luxembourgish is embedded in a multilingual context on the divide between Romance and Germanic cultures and remains one of Europe’s under-described languages. We describe our efforts in building a large-vocabulary ASR system for such a “minority” language (target language: Luxembourgish) without any transcribed audio training data. Instead, acoustic models are derived from major languages (sou...